Class 4 Assignment | Thematic Map Design, Part II

North America Thematic Maps

Source: https://www.davidrumsey.com/luna/servlet/detail/RUMSEY~8~1~243578~5513495:North-America-Thematic-Maps-

Concepts & Themes:

This week’s assignment will encompass the following concepts covered in Class 4 lecture & lab:

  • Thematic mapping purposes & methods
  • Ecological fallacies & thematic design pitfalls
  • Choropleth mapping
  • Classification techniques
  • Areal unit and population normalization

Specific techniques covered this week will include:

  • Thematic techniques for effective map display
  • Tabular data joins
  • Normalization
  • Thematic map outputs
  • Histograms & statistics

Class 4 Readings:

The Class 4 quiz (02/19/2023 - Sunday) will feature 10 questions covering content in textbook Chapters 4 and 5, as noted below:

  • Essentials of Geographic Information Systems textbook - Chapter 4, Section 4.4 Cartographic Data Classification - pages 85 - 90.


  • Essentials of Geographic Information Systems textbook - Chapter 5, Section 5.1 Geographic Data Acquisition - pages 91 - 93.
    • Read Section Attribute Data Types.
    • Read Section Measurement Scale.

Supplemental readings:

  • SQL cheatsheet (not necessary for the Class 4 assignment; good for general reference)
  • Normalizing Census Data
  • Normalizing Census Data Formulas

Below are links to further readings for those interested in the United States political dimension of ‘MAUP’, Gerrymandering and a divided US electorate:

Class 4 Demonstration Lab:

  • Topics Covered:
    • Tabular joins
    • Saving joined data
    • Classification methods
    • Color ramps and class breaks
    • Data dictionaries
    • Basic QGIS statistical tools
  • Class 4 Demonstration Lab

Class 4 Assignment Steps:

  • For this assignment submission, utilize the data process from the Class 4 Demonstration Lab to get started.

  • General concepts covered in Class 4 Assignment - Thematic Mapping with US Census Data :

    1. Connecting data sources - ESRI .gdb
    2. OTF - ‘on the fly’ projections for project file (not layers per se)
    3. Table joins
    4. Data dictionary
    5. Normalization of populations
    6. Creating geometry for features (in preparation of sq. area normalizations)
    7. Exporting/saving joined data
    8. Classification methods for mapped census data
    9. Basic statistics for census theme

Note: this week will feature a CRS ‘on the fly projection’ transformation to achieve a better map ‘shape’ for the required assignment map. The image below shows an equal area projection vs. the default WGS84 projection. The image source is linked to an article that discusses these differences further:

Map Projection Options Suitable for US Contiguous States

Source: https://source.opennews.org/articles/choosing-right-map-projection/

  • In this assignment, one thematic map for Contiguous United States based on a demographic variable derived from the American Community Survey 2020 data ACS_2020_5YR_County (5-year estimates from the 2016-2020 ACS) will be developed. This will require a two-step process: access and prepare the data, followed by a classified mapping of the data.

  • The top grade for this map will be 90 points. There is an extra credit component worth an additional 10 points. The video guide toward the end of the assignment walks through an example of the extra credit component using the age+sex table.

Note: the video features older ACS data, but the process will be very similar to load, connect and map census data.

The videos will not cover the cartographic design elements covered in previous classes; make sure to use all the cartography skills you have learned to enhance this week’s deliverable: legend, insets (consider including Alaska, Hawaii and Puerto Rico as map insets - an example HERE), map titling and data sourcing. Scale bars and north arrows are often not essential for effective thematic mapping; use them sparingly, if at all, when producing thematic maps.

Note: including map insets for Hawaii, Alaska, Puerto Rico and other county census geographies not in the lower 48 is not required for this assignment; add them to your map layout only if you prefer.

  • General process outline:
  1. Navigate to data and download bundled product TIGER/Line with Selected Demographic and Economic Data via the following link:

US Census ACS .gdb product

  1. Utilize American Community Survey 5-Year Estimates — Geodatabase Format 2016-2020

  2. Select County > Download Data:

Download Census Data

  1. Unzip the resulting product ACS_2020_5YR_COUNTY.gdb.zip. Make sure to keep the resulting directory, including its folder extension, intact: ACS_2020_5YR_COUNTY.gdb:

.gdb format

  1. At this juncture, all the tables as well as geometry that are located in the .gdb are available for import into QGIS. However, we only need the geometry for the county boundaries plus one theme - for this mapping, we will use the Total Population from the Age and Sex theme.

Note that this is an estimate of the total population, but it is a good approximation for our purposes. The census derives total population estimates within the ACS product via a process known as Imputation.

With population counts secured, we will then derive a population density estimate per county in the final mapping. This process is Normalization by Geography; that is, we are using population counts and normalizing those by county areal size. This is a different normalization process than Normalization by Population inside a census theme.

We need to know which table to import. To do this, we use an online HTML document, ACS variables 2020, that lists all the ACS tables and their themes. There are over 64,000 variables in this dataset, so isolating the correct table is an essential first step.

ACS Variables and their Concepts listed via API

source: https://api.census.gov/data/2020/acs/acs5/variables.html

Alternatively, we can import the COUNTY_METADATA_2020 tabular data and view within QGIS.

COUNTY_METADATA_2020

  1. For the assignment mapping, we will use the SEX BY AGE table B01001. Here we will use just the first variable, which is the total population estimate per county - B01001e1:

    • B01001e1|SEX BY AGE - Universe: Total population - Total: -- (Estimate)

Census Theme in QGIS Attribute Table

  1. To proceed, point QGIS to the .gdb to begin mapping. We will discuss the .gdb format during the Class 4 Lecture and Lab. For .gdb imports to QGIS, see also the video references at the end of the assignment.

Point QGIS to the .gdb

Select both the geometry and table B01001 - SEX BY AGE. This is also referred to as the X01_AGE_AND_SEX table:

Select Geometry + Tabular Data

Connect to .gdb - feature + tabular data

  1. Next, the tabular data for AGE AND SEX will be exported from the .gdb as a .csv. As this is done, the table will be ‘thinned’ to just those variables needed for the mapping: the OBJECTID, GEOID and B01001_001E alone. Save the export as acs.2020.population.csv in the assignment project folder. Also set the geometry type to No geometry:

Table Structure

  10. Next, import acs.2020.population.csv as delimited text:

Delimited Text as import file type
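For those who want to see the ‘thinning’ logic outside QGIS, the sketch below keeps only the three fields named above and drops everything else. It is an illustration only, not part of the required workflow: it assumes the AGE AND SEX table has already been exported in full to CSV, and the sample rows and the extra column B01001_002E are made up for demonstration.

```python
import csv
import io

# Fields kept for the mapping, as named in the export step above.
KEEP = ["OBJECTID", "GEOID", "B01001_001E"]

def thin_csv(source, target, keep=KEEP):
    """Copy only the `keep` columns from one CSV stream to another."""
    reader = csv.DictReader(source)
    # extrasaction="ignore" silently drops any column not listed in `keep`.
    writer = csv.DictWriter(target, fieldnames=keep, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)

# Hypothetical sample rows; real values come from the .gdb export.
sample = io.StringIO(
    "OBJECTID,GEOID,B01001_001E,B01001_002E\n"
    "1,05000US01001,55000,27000\n"
    "2,05000US01003,218000,106000\n"
)
thinned = io.StringIO()
thin_csv(sample, thinned)
print(thinned.getvalue())
```

With real files, the two in-memory streams would be replaced by files opened with open(..., newline="").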

  1. To obtain population density, we will divide the total population by the total area of each county. The unit will be square miles, resulting in ‘Persons per Sq. Mile, per County’. We can either create the area of each county using the Field Calculator or simply use the ALAND variable in the dataset, which gives the land area of each county in square meters. To convert ALAND to square miles, the following calculation is used:
  • 1 sq. mile = 2,589,988.110336 sq. meters
  • ALAND/2589988.110336
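The conversion above can be checked with a few lines of Python. This is a minimal sketch; the function name and the example ALAND value are hypothetical and only illustrate the arithmetic.

```python
# Square meters in one square mile: one mile is 1609.344 m, squared.
SQ_METERS_PER_SQ_MILE = 2_589_988.110336

def aland_to_sq_miles(aland_sq_m):
    """Convert a land area in square meters (ALAND) to square miles."""
    return aland_sq_m / SQ_METERS_PER_SQ_MILE

# Hypothetical ALAND value for a single county, in square meters.
example_aland = 1_539_602_123
print(round(aland_to_sq_miles(example_aland), 2))
```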
  1. To proceed, first export the geometry from the .gdb to a .shp and title it acs.geo.shp. Make sure NOT to change the coordinate system, which is NAD83 - EPSG:4269:

.shp Export

  1. Next, use the ALAND calculation above (ALAND/2589988.110336) to create a new field sq.mile within acs.geo.shp, not within the tabular data.

Field Calculator

  1. Next, a table join will be enacted between the geometry acs.geo and the tabular data acs.2020.population. As is, these two data files exist side-by-side in the QGIS project. We need to ‘join’ these two files based on a common attribute. This is known as a ‘table join’. To start, save the project to update to the current data files.

  2. Next, preview the two attribute tables to determine the attribute join. In this case both contain the critical GEOID that is the US Census unique identifier across all census geographies:

Table Join Preparation

  • acs.geo = GEOID_DATA

Table Join Preparation

  • acs.2020.population = GEOID
  1. Inside acs.geo, navigate to Properties > Joins > green plus button and populate as follows:

Table Join Preparation

  1. A successful table join will result in the new population variable from acs.2020.population joined correctly to the acs.geo layer (far right field in the image below). This table join is currently held in temporary memory. It must now be exported as a new .shp before proceeding:

Table Join Preparation
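Conceptually, the table join matches each county feature to the tabular row that shares its identifier. The sketch below mimics that behavior with plain Python dictionaries. It is not how QGIS implements joins; the county names, areas, population counts and GEOID values are hypothetical sample data, and the helper name left_join is made up for illustration.

```python
# Hypothetical attribute rows standing in for the acs.geo layer.
geo_rows = [
    {"NAME": "County A", "GEOID_DATA": "05000US01001", "sq.mile": 594.4},
    {"NAME": "County B", "GEOID_DATA": "05000US01003", "sq.mile": 1589.8},
    {"NAME": "County C", "GEOID_DATA": "05000US01005", "sq.mile": 885.0},
]

# Hypothetical rows standing in for acs.2020.population.
pop_rows = [
    {"GEOID": "05000US01001", "B01001_001E": 55000},
    {"GEOID": "05000US01003", "B01001_001E": 218000},
]

def left_join(target, join_table, target_key, join_key):
    """Attach join_table fields to each target row with a matching key.

    Every target row is kept; rows without a match get None in the
    joined fields.
    """
    lookup = {row[join_key]: row for row in join_table}
    join_fields = [f for f in join_table[0] if f != join_key]
    joined = []
    for row in target:
        match = lookup.get(row[target_key], {})
        joined.append({**row, **{f: match.get(f) for f in join_fields}})
    return joined

joined = left_join(geo_rows, pop_rows, "GEOID_DATA", "GEOID")
for row in joined:
    print(row["NAME"], row["B01001_001E"])
```

County C has no matching population row in this sample, so its joined value is None. Empty joined values in the real attribute table are a sign that the join keys do not line up.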

  1. While exporting, we will drop fields not needed. We only need the name, GEOID, sq. miles and the population count:
  • Name the export acs.2.map.shp:

.shp Export

  • Toggle ON the fields to export:

.shp Export

  • Resulting feature:

.shp Export - Result

  1. Next, the population count will be normalized by the sq. miles in the acs.2.map.shp feature. Create a new field via the Field Calculator and populate as follows. Then toggle editing OFF and save the new field:

Population Density Calculation in Field Calculator

Note: a Whole Number (integer) field type is selected because persons can only be whole numbers, not decimal numbers, i.e. there are only whole persons, not partial persons. There are approximately 30 counties that contain fewer than 1 person per sq. mile. These counties will simply receive a 0 and will be classed accordingly in the final thematic map.
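The density calculation and the whole-number note can be sketched as follows. The function name and the sample counts and areas are hypothetical. Exactly how QGIS converts a decimal result into a Whole Number field is not stated in this assignment; the sketch simply drops the fractional part, which is an assumption chosen to match the note that counties with fewer than 1 person per sq. mile receive a 0.

```python
def population_density(population, sq_miles):
    """Persons per square mile as a whole number.

    Assumption: the fractional part is dropped, so any density below 1
    becomes 0, as described in the note above.
    """
    return int(population / sq_miles)

# Hypothetical counties: (population count, area in square miles).
print(population_density(55000, 594.4))   # a moderately populated county
print(population_density(400, 900.0))     # fewer than 1 person per sq. mile
```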

  1. With Data Preparation complete, move to thematic mapping and data classification. To start, navigate to symbology - Properties > Symbology > Graduated > Select Value pop.den > Classify button at bottom:

Graduated Symbology - Equal Count - Quantile

  1. With the initial classification, the population density variable skews towards the lowest class:

Cartographic Result

  22. Rerunning the classification using the quantile method results in a much more ‘balanced’ map across the 5 breaks, while still retaining the low population geographies of the Natural Breaks method:

Graduated Symbology - Natural Breaks

Cartographic Result
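To see why the choice of classification method matters for skewed data such as population density, the sketch below compares class counts under two simple schemes. Note the substitution: Natural Breaks (Jenks) is not implemented here; equal interval is used only as a simple contrast to the quantile method. The density values are hypothetical, and the quantile rule in Python's statistics module may place breaks slightly differently than QGIS does.

```python
import statistics

def equal_interval_breaks(values, classes=5):
    """Interior break values that split the data range into equal widths."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / classes
    return [lo + step * i for i in range(1, classes)]

def quantile_breaks(values, classes=5):
    """Interior break values that put roughly equal counts in each class."""
    return statistics.quantiles(values, n=classes, method="inclusive")

def class_counts(values, breaks):
    """Count the values falling into each class defined by `breaks`."""
    counts = [0] * (len(breaks) + 1)
    for v in values:
        # A value belongs to the class after every break it exceeds.
        counts[sum(v > b for b in breaks)] += 1
    return counts

# Hypothetical, strongly skewed densities (persons per sq. mile).
densities = [0, 1, 2, 3, 5, 8, 12, 20, 35, 60,
             110, 250, 600, 1800, 9000, 70000]

print("equal interval:", class_counts(densities, equal_interval_breaks(densities)))
print("quantile:      ", class_counts(densities, quantile_breaks(densities)))
```

With skewed data, equal-width classes pile nearly every county into the lowest class, while quantile classes spread counties evenly across the color ramp.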

  1. Finally, change the project map CRS to North America Albers Equal Area Conic. This will give the final map better areal representation - an ingredient that is important for choropleth mapping:

Transform the project CRS

Better areal representation with Albers CRS projection

  1. Continue to final map layout and design. Utilize the Map Example for the layout items to be included in this assignment submission:

Assignment Deliverable:

Produce the final map layout and design. Output as PDF, 300 DPI, 8.5”x11” or 11”x17” (use .png or .tiff if the PDF at 300 DPI produces too large a file). If pursuing the extra credit map, follow the guidelines provided below and, again, produce a map layout, design and output similar to the main map assignment.

Class 4 Extra Credit Mapping:

You are strongly encouraged to pursue the extra credit portion of the assignment for a top potential score of 100 points. While the required map above features basic demographic data at the county level, the extra credit map will feature a more tailored exploration of census data. In this extra credit mapping, you will use the same bundled US Census ACS product as the data source. Instead of normalizing the data by areal units (population density per sq. mile), you will normalize the census theme by the theme universe population per US county.

The equation for this population normalization: (census theme count / census universe population) * 100
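The population normalization can be sketched as a small function. The function name and the sample numbers are hypothetical; returning None for an empty universe is an assumption made here to avoid division by zero, not a rule stated in the assignment.

```python
def percent_of_universe(theme_count, universe_population):
    """Census theme count as a percentage of the theme's universe."""
    if universe_population == 0:
        # Assumption: treat counties with an empty universe as no data.
        return None
    return theme_count / universe_population * 100

# Hypothetical county: 250 persons in the theme out of a universe of 1,000.
print(percent_of_universe(250, 1000))
```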

Like the main assignment, you can utilize the bundled .gdb format for the 2020 version of the ACS 5-year survey.

Video Guide:

Assignment 4 Extra Credit - Thematic Mapping - normalization + classification methods:

video

Note: the video guide uses ACS data of a prior vintage. This should not materially affect the methodology shown in the video.

Further Reference:

US Census Links:

Online tools & utilities to aid thematic map design:

Helpful articles and resources for census data and thematic mapping techniques:

Case Study - The Marshall Project:

  • The Marshall Project

  • The Marshall Project extracted the number of adults in correctional facilities per county from the 2000, 2010 and 2020 Decennial Census.

U.S. County distribution of Incarcerated Populations

  • The data can be downloaded HERE and HERE

Data Fields in the Dataset